Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uploader: Add support for backup storage upload #60

Merged
merged 16 commits into from
Jun 25, 2024
Merged

Conversation

victorges
Copy link
Member

This implements the storage backup protocol proposed on this design doc.

This implements both the Backup URLs config as well as the logic to use them on the upload.

Some moving around was done on the core upload logic to get a clean result, without repeating
the fallback upload logic everywhere.

This was required to be able to use the ff lib for some reason.
Copied all the stuff we usually have for flag parsing
using peterbourgon/ff and an explicit FlagSet
By migrating to uploadFile it gets support
automatically too.
Primary and backup have different storage providers
so we need to move that lower layer.
@victorges victorges requested review from mjh1, emranemran, thomshutt and leszko and removed request for mjh1 and emranemran June 19, 2024 20:16
@victorges
Copy link
Member Author

Will add some unit tests tomorrow.

Copy link

@leszko leszko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Added some suggestions and comments. Other than that, I think we need unit tests, I'd have both:

  • small unit tests for functions you created (in particular buildBackupURI(), newFileProperties() (if you decide to keep it) )
  • integration test (in which you specify env variables like AWS_S3_BUCKET AWS_S3_BACKUP_BUCKET and when AWS_S3_BUCKET is not correct, then you can check it writes correctly to the AWS_S3_BACKUP_BUCKET bucket)

catalyst-uploader.go Show resolved Hide resolved
catalyst-uploader.go Outdated Show resolved Hide resolved
core/uploader.go Outdated Show resolved Hide resolved
core/uploader.go Outdated Show resolved Hide resolved
core/uploader.go Outdated Show resolved Hide resolved
@leszko
Copy link

leszko commented Jun 20, 2024

@victorges one more question, what about the catalyst-api part (for ingesting data from both buckets)?

core/uploader.go Show resolved Hide resolved
@thomshutt
Copy link
Contributor

thomshutt commented Jun 21, 2024

Agree with Rafal that this is a great candidate for integration tests

@victorges victorges force-pushed the vg/feat/backup-storage branch 2 times, most recently from 833fe94 to 34902f5 Compare June 21, 2024 21:56
@victorges victorges force-pushed the vg/feat/backup-storage branch from 4f56725 to d509819 Compare June 24, 2024 19:11
@mjh1
Copy link
Contributor

mjh1 commented Jun 25, 2024

@victorges So this is good to merge now right? I've added the backoff Reset(). I've done a little bit of local testing too, it is backwards compatible with our current usage without environment variables right? It seems to be from my testing but just to double check.

Copy link
Member Author

@victorges victorges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mjh1 Yeah it should be pretty backward compatible! I've re-reviewed this and the only difference in behavior will be that we get this log when the primary upload fails: https://github.com/livepeer/dms-uploader/blob/7eacbe24169f0f543dcbba3f9cd2edc6f496e191/core/uploader.go#L124

But the same original error from the primary is returned anyway, so the rest of the code will work the same.

I added just 1 comment inline for a potential hotfix. Feel free to merge this anyway!

outputStr := outputURI.String()
// While we wait for storj to implement an easier method for global object deletion we are hacking something
// here to allow us to have recording objects deleted after 7 days.
if strings.Contains(outputStr, "gateway.storjshare.io/catalyst-recordings-com") {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to check only for gateway.storjshare.io here in case we end up using Storj for the backup storage too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's ok, if we do use storj for backup we can set the expiry on the access keys instead, we just needed this hack because we needed it before storj implemented that.

@mjh1 mjh1 merged commit c84d101 into main Jun 25, 2024
8 checks passed
@mjh1 mjh1 deleted the vg/feat/backup-storage branch June 25, 2024 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants